Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA
Authors
Abstract
Classification is an important problem in data mining. Given a data set, a classifier generates a meaningful description for each class. Decision trees are among the most effective and widely used classification methods. There are several algorithms for the induction of decision trees. These trees are first induced, and subtrees are then removed in a subsequent pruning phase to improve accuracy and prevent overfitting. In this paper, various pruning methods are discussed along with their features, and the effectiveness of pruning is evaluated. Accuracy is measured on the diabetes and glass datasets with various pruning factors. The experiments on these two datasets report both the accuracy and the size of the tree.
General Terms: Classification, Data Mining
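The induce-then-prune idea described in the abstract can be illustrated with a minimal sketch of reduced-error pruning, one commonly described pruning method. This is not the paper's code and not WEKA's implementation: the `Node` class, the hand-built tree, and the tiny pruning set are all hypothetical, chosen only to show a subtree being replaced by a leaf when that does not increase the error on held-out data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Sample = Tuple[List[float], int]  # (feature vector, class label)


@dataclass
class Node:
    # `label` is the majority training class at this node (assumed known).
    # A node is a leaf when `feature` is None; otherwise it sends a sample
    # left when x[feature] <= threshold and right otherwise.
    label: int
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node: Node, x: List[float]) -> int:
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def size(node: Node) -> int:
    """Number of nodes (internal nodes plus leaves) in the tree."""
    if node.feature is None:
        return 1
    return 1 + size(node.left) + size(node.right)


def errors(node: Node, data: List[Sample]) -> int:
    """Number of samples in `data` that the tree misclassifies."""
    return sum(1 for x, y in data if predict(node, x) != y)


def reduced_error_prune(node: Node, data: List[Sample]) -> Node:
    """Prune bottom-up: replace a subtree by a leaf whenever the leaf makes
    no more errors than the subtree on the pruning samples reaching it."""
    if node.feature is None:
        return node
    left_data = [(x, y) for x, y in data if x[node.feature] <= node.threshold]
    right_data = [(x, y) for x, y in data if x[node.feature] > node.threshold]
    node.left = reduced_error_prune(node.left, left_data)
    node.right = reduced_error_prune(node.right, right_data)
    leaf_errors = sum(1 for _, y in data if y != node.label)
    if leaf_errors <= errors(node, data):
        return Node(label=node.label)
    return node


def build_overfit_tree() -> Node:
    # Hypothetical tree whose right branch contains a spurious extra split.
    return Node(
        label=0, feature=0, threshold=5.0,
        left=Node(label=0),
        right=Node(label=1, feature=1, threshold=2.0,
                   left=Node(label=1), right=Node(label=0)),
    )


# Hypothetical held-out pruning set (not taken from the diabetes or glass data).
PRUNING_SET: List[Sample] = [
    ([1.0, 1.0], 0), ([2.0, 3.0], 0),
    ([6.0, 1.0], 1), ([7.0, 3.0], 1), ([8.0, 2.5], 1), ([9.0, 1.5], 1),
]

tree = build_overfit_tree()
print("before pruning: size", size(tree), "errors", errors(tree, PRUNING_SET))
tree = reduced_error_prune(tree, PRUNING_SET)
print("after pruning:  size", size(tree), "errors", errors(tree, PRUNING_SET))
```

The sketch reports the same two quantities the abstract mentions, accuracy (here as an error count) and tree size, before and after pruning. Real experiments would use a learned tree and a proper train/validation split rather than a hand-built example.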
Similar resources
Decision tree studies for estimating breast cancer risk using single nucleotide polymorphisms
Abstract. Introduction: Decision trees are data mining tools for collecting and sifting information from massive amounts of data and for making accurate predictions, and they are widely used in computational biology and bioinformatics. In bioinformatics, they can be used to predict diseases, including breast cancer. The use of genomic data, including single nucleotide polymorphisms, is a very important ...
An Empirical Comparison of Supervised Learning Algorithms in Disease Detection
In this paper, an empirical comparison of various supervised algorithms is carried out. We studied the performance of machine learning tools such as Naïve Bayes, support vector machines, radial basis neural networks, and the decision trees J48 and Simple CART in detecting diseases. We used both binary and multi-class data sets, namely WBC, WDBC, Pima Indians Diabetes database and Breast tiss...
An Empirical Comparison of Pruning Methods for Ensemble Classifiers
Many researchers have shown that ensemble methods such as Boosting and Bagging improve the accuracy of classification. Boosting and Bagging perform well with unstable learning algorithms such as neural networks or decision trees. Pruning decision tree classifiers is intended to make trees simpler and more comprehensible and avoid over-fitting. However it is known that pruning individual classif...
Comparison of decision tree methods for finding active objects
The automated classification of objects from large catalogues or survey projects is an important task in many astronomical surveys. Faced with various classification algorithms, astronomers should select the method according to their requirements. Here we describe several kinds of decision trees for finding active objects by multiwavelength data, such as REPTree, Random Tree, Decision Stump, Ra...
Pruning Decision Trees with Misclassification Costs (appears in ECML-98 as a research note; a longer version is available as ECE TR 98-3, Purdue University)
We describe an experimental study of pruning methods for decision tree classifiers when the goal is minimizing loss rather than error. In addition to two common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and one pruning variant based on the Laplace correction. We perform an empirical com...
Journal title:
Volume, issue:
Pages: -
Publication date: 2012